[RAG] Agentic RAG workflow on tabular data from a PDF file #376

Merged 63 commits on Jan 8, 2025

Conversation

AgentGenie
Collaborator

Why are these changes needed?

Introduce a workflow to extract accurate tabular data from a PDF file.
This is a key component for supporting agentic RAG on general PDF files.

Related issue number

Related to #68

Checks

@marklysze
Collaborator

marklysze commented Jan 8, 2025

Thanks @AgentGenie, nice work on getting a workflow to successfully extract data from a table image within a PDF!

A couple of notes:

  • Can you please add the requirement to install with the neo4j extra: pip install ag2[neo4j]
  • For the last bullet point in the top section, can you please indicate (in better words than mine :) ) that the agentic workflow uses a RAG agent to extract document metadata (the image of the table data, based on the table name), a multi-modal agent to convert the table image to markdown, and an LLM to answer the question.
  • For the Parse PDF file section, I know it has a note on price, but can you please update it to include an estimated cost to run (e.g. you noted $15 USD)? Similarly, it says to skip unless you insist, but can you tell them where to skip to (I think the cell starting with # IMPORTS)?
  • It would be good to ask a second question in the final run (which I tried and it worked well).

Thanks!

@AgentGenie
Collaborator Author

@marklysze Addressed the comments. Pre-commit looks fine when I run it locally. I guess the failure is because of git lfs, but I don't know how to resolve it. Can we skip it?
Screenshot 2025-01-07 at 11 28 02 PM

@marklysze
Collaborator

I'll look at ignoring these in the pre-commit config.

Collaborator

@marklysze marklysze left a comment

Nice work @AgentGenie!

@marklysze marklysze merged commit 4bf5956 into main Jan 8, 2025
221 of 224 checks passed
7 participants